CarveML: application of machine learning to file fragment classification

نویسنده

  • Andrew Duffy
چکیده

We present a learning algorithmic approach to the problem of recognzing the file types of file fragments, with the purpose of applying this to “file carving”, the reconstruction of partially erased files on disk into whole files. We do so through the use of 257 calculated features of an input fragment, applying the Support Vector Machine, Multinomial Naive Bayes, and Linear Discriminant Analysis models to our problem to see which produces the most accurate method of classification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: In processing large data, scientists have to perform the tedious task of analyzing hefty bulk of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be unified to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...

متن کامل

A File Fragment Classification Method Based on Grayscale Image

File fragment classification is an important and difficult problem in digital forensics. Previous works in this area mainly relied on specific byte sequences in file headers and footers, or statistical analysis and machine learning algorithms on data from the middle of the file. This paper introduces a new approach to classify file fragment based on grayscale image. The proposed method treats a...

متن کامل

APPLICATION OF THE HYBRID HARMONY SEARCH WITH SUPPORT VECTOR MACHINE FOR IDENTIFICATION AND CALSSIFICATION OF DAMAGED ZONE AROUND UNDERGROUND SPACES

An excavation damage zone (EDZ) can be defined as a rock zone where the rock properties and conditions have been changed due to the processes related to an excavation. This zone affects the behavior of rock mass surrounding the construction that reduces the stability and safety factor and increase probability of failure of the structure. This paper presents an approach to build a model for the ...

متن کامل

A Hybrid Machine Learning Method for Intrusion Detection

Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...

متن کامل

Fast Content-Based File Type Identification

Digital forensic examiners often need to identify the type of a file or file fragment based on the content of the file. Content-based file type identification schemes typically use a byte frequency distribution with statistical machine learning to classify file types. Most algorithms analyze the entire file content to obtain the byte frequency distribution, a technique that is inefficient and t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014